Sicheng Yu

Generator-Refiner-Examiner: A Tri-Module Data Augmentation Framework for 3D Human Avatar Learning from Monocular Videos

May 22, 2026

MMGS: 10$\times$ Compressed 3DGS through Optimal Transport Aggregation based on Multi-view Ranking

May 19, 2026

Pseudo-View Enhancement via Confidence Fusion for Unposed Sparse-View Reconstruction

Feb 25, 2026

Rotation-free Online Handwritten Character Recognition Using Linear Recurrent Units

Feb 02, 2026

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Dec 28, 2025

3D Question Answering via only 2D Vision-Language Models

May 28, 2025

Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs

May 25, 2025

RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes

Feb 21, 2025

Reverse Modeling in Large Language Models

Oct 13, 2024

Frame-Voyager: Learning to Query Frames for Video Large Language Models

Oct 07, 2024